Covariance-based Clustering in Multivariate and Functional Data Analysis
Abstract
In this paper we propose a new algorithm for clustering multivariate and functional data. We study the case of two populations that differ in their covariances rather than in their means. The algorithm relies on a proper quantification of the distance between the estimated covariance operators of the populations, and it subdivides the data into two groups so as to maximise the distance between their induced covariances. The naive implementation of such an algorithm is computationally prohibitive, so we propose a heuristic formulation with much lower complexity, and we study its convergence properties along with its computational cost. To improve the precision of the clustering, we also propose estimating the discrete covariances of functional data with an enhanced estimator, namely a linear shrinkage estimator. We establish the effectiveness of our algorithm through applications to both synthetic data and a real data set from a biomedical context, and we show how the use of shrinkage estimation may lead to substantially better results.
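The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not specify the distance, the heuristic, or the shrinkage rule, so everything here is an assumption made for illustration. The sketch assumes two groups of equal size, uses the Frobenius norm as the distance between covariances, searches by greedy pairwise swaps, and shrinks each sample covariance towards a scaled identity with a fixed, hand-picked intensity `alpha`. The names `shrunk_cov`, `cov_distance`, and `greedy_covariance_split` are hypothetical.

```python
import numpy as np

def shrunk_cov(X, alpha=0.0):
    """Sample covariance shrunk linearly towards a scaled identity.

    alpha is a fixed intensity in [0, 1] chosen by the user (an assumption);
    alpha = 0 returns the plain sample covariance.
    """
    S = np.cov(X, rowvar=False)
    p = S.shape[0]
    target = (np.trace(S) / p) * np.eye(p)
    return (1.0 - alpha) * S + alpha * target

def cov_distance(X, labels, alpha=0.0):
    """Frobenius distance between the covariances of groups 0 and 1."""
    c0 = shrunk_cov(X[labels == 0], alpha)
    c1 = shrunk_cov(X[labels == 1], alpha)
    return float(np.linalg.norm(c0 - c1, ord="fro"))

def greedy_covariance_split(X, alpha=0.0, max_sweeps=50, seed=0):
    """Greedy swap search for two equal-sized groups with distant covariances.

    Starts from a random balanced split; each sweep applies the single swap
    that increases the distance the most and stops when no swap helps.
    Returns the labels and the objective value after each accepted swap.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = np.zeros(n, dtype=int)
    labels[rng.permutation(n)[: n // 2]] = 1
    history = [cov_distance(X, labels, alpha)]
    for _ in range(max_sweeps):
        best_value, best_pair = history[-1], None
        for i in np.flatnonzero(labels == 0):
            for j in np.flatnonzero(labels == 1):
                labels[i], labels[j] = 1, 0      # try the swap
                value = cov_distance(X, labels, alpha)
                labels[i], labels[j] = 0, 1      # undo it
                if value > best_value:
                    best_value, best_pair = value, (i, j)
        if best_pair is None:                    # local maximum reached
            break
        i, j = best_pair
        labels[i], labels[j] = 1, 0
        history.append(best_value)
    return labels, history
```

Each sweep evaluates every cross-group swap, so the cost grows quadratically with the sample size. That is still far cheaper than enumerating all balanced splits, but the search can stop at a local maximum and carries no guarantee of recovering the true groups.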
Related articles
A Clustering Based Location-allocation Problem Considering Transportation Costs and Statistical Properties (RESEARCH NOTE)
Cluster analysis is a useful technique in multivariate statistical analysis. Different types of hierarchical cluster analysis and K-means have been used for data analysis in previous studies. However, the K-means algorithm can be improved using metaheuristic algorithms. In this study, we propose a simulated annealing based algorithm for K-means in the clustering analysis which we refer it a...
Fisher’s Linear Discriminant Analysis for Weather Data by reproducing kernel Hilbert spaces framework
With recent advances in science and technology, data of a functional nature have become easy to collect. Hence, statistical analysis of such data is of great importance. As in multivariate analysis, linear combinations of random variables play a key role in functional analysis. The theory of reproducing kernel Hilbert spaces is very important in this context. In this paper we study a gen...
Estimation of Climate Zone Effects on Iranian Temperature, Humidity, and Precipitation using Functional Analysis of Covariance
Functional Data Analysis (FDA) has recently made considerable progress because of easier access to the data that are essentially in the form of curves. Although functional modeling of Iranian precipitation based on temperature or humidity was done before, here we use functional analysis of variance and covariance to analyze the weather data collected randomly from Iranian weather stations in 20...
A Clustering Approach by SSPCO Optimization Algorithm Based on Chaotic Initial Population
The main task of clustering analysis is to assign a set of objects to groups such that objects in one group or cluster are more similar to each other than to objects in other clusters. The SSPCO optimization algorithm is a new optimization algorithm inspired by the behavior of a type of bird called the see-see partridge. One of the things that smart algorithms are applied to solve is the problem ...
A Comparison of Information Criteria in Clustering Based on Mixture of Multivariate Normal Distributions
Clustering analysis based on a mixture of multivariate normal distributions is commonly used in the clustering of multidimensional data sets. Model selection is one of the most important problems in mixture cluster analysis based on the mixture of multivariate normal distributions. Model selection involves the determination of the number of components (clusters) and the selection of an appropri...
Journal title:
- Journal of Machine Learning Research
Volume 17, issue
Pages -
Publication date: 2016